REMARKS

This Response is filed within two months of the mailing date of the Final Office Action 
dated January 13, 2005. Claims 1-24 are pending, with claims 1, 11, 17 and 22 being the only 
independent claims. Claims 1, 2, 11, 17, 22 and 23 have been amended. No new matter has been 
added by way of the amendment. Reconsideration and withdrawal of rejections are respectfully 
requested. 

Claims 1-9, 11-15, 17 and 19-23 were rejected under 35 U.S.C. §103(a) as being 
unpatentable over U.S. Patent No. 6,460,056 ("Horii") and U.S. Patent No. 6,665,643 ("Lande").
Claims 10, 16, 18, and 24 were rejected under 35 U.S.C. §103(a) as being unpatentable over Horii
and Lande, and further in view of "Text-driven automatic frame generation using MPEG-4
synthetic/natural hybrid coding for 2-D head-and-shoulder scene" ("TDAFG"). 

Claims 1 and 11 have been amended to recite the steps of "processing [an] audio/video
signal to generate an isolated audio component signal; isolating the speech component from the 
isolated audio component signal; and rendering an animation image on a portion of the monitor 
based on the animation signal generated from said animation model parameters." These claim 
limitations were previously in claim 2; therefore these amendments do not raise any new issues 
requiring further research or consideration. Support for the amendments may be found at page 8, 
lines 3-10 and page 9, lines 1-4 of the specification. No new matter has been added.

The invention relates to the display of a sign language animation image corresponding to a 
speech component of an audio/video (A/V) signal. Specifically, the sign language animation image 
is displayed simultaneously with a visual image corresponding to a video component of the 
audio/video signal. This functionality is accomplished by isolating a speech component from the 
audio signal components of the A/V signal. The isolated audio component signal is processed to 
obtain the speech components of the A/V signal. These speech components are mapped to a sign 
language animation model to generate animation model parameters which correspond to sign 
language images. An animation signal is generated using the animation model parameters. The 
resulting animation model parameters are then transmitted along with the A/V signal to a monitor 
display, wherein an animation image is rendered on the monitor display screen based on the
animation signal generated from the animation model parameters (see page 4,
lines 1-16 of the specification). 

In contrast, Horii relates to an image display method and apparatus for displaying sign 
language images corresponding to speech (see col. 1, lines 12-14). According to Horii, image 
data (such as sign language images) are stored in an image dictionary in motion picture form.
Document data is read out from a character information storage device (or speech data is 
received), and a sign language image corresponding to a character string of the document data 
(or the speech) is selected from the image dictionary and displayed on a display (see Abstract). 
However, Horii fails to teach, inter alia, the steps of "processing [an] audio/video signal to
generate an isolated audio component signal; isolating ... speech components from the isolated 
audio component signal ... and ... rendering an animation image on a portion of [a] monitor 
based on the animation signal generated from said animation model parameters, [wherein the] 
animation image [contains] sign language gestures corresponding to the speech component of the 
audio/video signal," as recited in amended independent method claims 1 and 11.

Horii teaches a microphone input terminal 11 of a speech recognizer (see Fig. 3 and Fig. 4).
Horii states the voice signal input from the microphone input terminal 11 is amplified by the
amplifier 12, and recognized by the speech recognizer 13 (see col. 4, lines 14-17). Horii also 
teaches a video input terminal 21 of a video input processor 22 (see Fig. 4). In each of the systems 
described in Horii, the speech signal is separate from the video signal. That is, there is no 
"processing [of an] audio/video signal to generate an isolated audio component signal," as 
recited in amended independent method claims 1 and 11.

With reference to Fig. 1 of the present invention, an A/V separator block 12 is provided 
for separating or splitting an input A/V signal and outputting at least two outputs. The first 
output provides the complete unaltered A/V signal. The other output provides only the audio 
component of the A/V signal (see pg. 8, lines 4-8 of the specification). Once the audio 
component is separated from the A/V signal, a speech isolator block 14 is then used to
identify and isolate the speech component from the remainder of the audio signal (see pg. 8, lines 
8-10 of the specification). Horii fails to teach or suggest the step of processing an A/V signal to 
generate an audio component signal, as recited in independent method claims 1 and 11.

Lande relates to a method and apparatus for receiving information items and for applying 
appropriate, geometric deformations to any facial model complying with the MPEG-4 standard 
(see col. 2, lines 32-35). Lande discloses the splitting of information characterizing the position of 
the speaker's mouth into groups of parameters characterizing mouth shape and positions of lips and
the jaw of a face model (see col. 2, lines 52-58). Lande discloses analyzing a speech signal to
animate facial expressions (i.e., the lips and jaw of a face), as opposed to animating sign language. 
Lande fails to cure the deficiencies of Horii. Specifically, Lande combined with Horii fails to teach 
or suggest the steps of "processing [an] audio/video signal to generate an isolated audio
component signal; isolating ... speech components from the isolated audio component signal ...
and ... rendering an animation image on a portion of [a] monitor based on the animation signal
generated from said animation model parameters, [wherein the] animation image [contains] sign
language gestures corresponding to the speech component of the audio/video signal," as recited
in amended independent method claims 1 and 11.

TDAFG has been cited as teaching the use of SNHC to generate animation parameters. 
However, TDAFG also fails to cure the deficiencies of the system defined by the combination of 
Horii and Lande, because the initial step of processing an A/V signal to generate an audio 
component signal is also not disclosed in TDAFG. In view of the foregoing, amended independent
method claims 1 and 11 are patentable over the combination of Horii, Lande, and TDAFG.
Consequently, reconsideration and withdrawal of the rejections under 35 U.S.C. §103(a) are in order,
and a notice to that effect is requested.

Independent claims 17 and 22 are system claims associated with the implementation of 
independent method claims 1 and 11, respectively. Accordingly, independent system claims 17 and
22 are also patentable over the combination of Horii, Lande, and TDAFG for the reasons discussed 
above with respect to independent method claims 1 and 11.

In view of the patentability of independent claims 1, 11, 17, and 22, for the reasons above,
dependent claims 2-10, 12-16, 18-21, 23 and 24 are all patentable over the prior art. 

Applicants submit that the amendment to the claims and the arguments herein do not raise 
new issues that would require further search. Applicants request entry of this amendment and
submit that this application is in condition for allowance. Early passage of this case to issue is 
requested. 

Respectfully submitted, 

COHEN, PONTANI, LIEBERMAN & PAVANE



By 




Reg. No. 35,698 

551 Fifth Avenue, Suite 1210 

New York, New York 10176 

(212)687-2770 

Dated: March 11, 2005 




